Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 1991 |
| Missing cells | 992 |
| Missing cells (%) | 3.8% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 202.3 KiB |
| Average record size in memory | 104.1 B |
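The missing-cell figures above can be cross-checked against the per-variable missing counts reported later in this document. The following is a minimal sketch of that arithmetic, not the profiler's own code; the variable names are illustrative.

```python
# Figures copied from the report; names are illustrative, not profiler internals.
n_variables = 13
n_observations = 1991

# Per-variable missing counts as listed in the report (all other variables have 0).
missing_by_variable = {
    "STATION CODE": 122,
    "ph": 301,
    "Sulfate": 469,
    "Trihalomethanes": 100,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_by_variable.values())
missing_pct = 100 * missing_cells / total_cells

print(f"Total cells:   {total_cells}")
print(f"Missing cells: {missing_cells} ({missing_pct:.1f}%)")
```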
Variable types
| Numeric | 10 |
|---|---|
| Categorical | 3 |
Alerts
| Alert | Type |
|---|---|
| LOCATIONS has a high cardinality: 666 distinct values | High cardinality |
| STATE has a high cardinality: 203 distinct values | High cardinality |
| year is highly overall correlated with Potability | High correlation |
| Potability is highly overall correlated with year | High correlation |
| STATION CODE has 122 (6.1%) missing values | Missing |
| ph has 301 (15.1%) missing values | Missing |
| Sulfate has 469 (23.6%) missing values | Missing |
| Trihalomethanes has 100 (5.0%) missing values | Missing |
| Hardness has unique values | Unique |
| Solids has unique values | Unique |
| Chloramines has unique values | Unique |
| Organic_carbon has unique values | Unique |
| Turbidity has unique values | Unique |
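The alerts above follow from a few per-variable counts. The sketch below shows one way such rules could be written; it is an illustration under stated assumptions, not the profiler's implementation. The cardinality threshold of 50 and the rule that a variable counts as "unique" only when it has no missing values are assumptions chosen because they reproduce the alerts listed here.

```python
N_ROWS = 1991
# Assumed cut-off for illustration; the profiler's configured value is not shown in this report.
HIGH_CARDINALITY_THRESHOLD = 50

# (distinct, missing, is_categorical), copied from the per-variable sections of the report.
columns = {
    "LOCATIONS": (666, 0, True),
    "STATE": (203, 0, True),
    "Potability": (2, 0, True),
    "ph": (1690, 301, False),
    "Sulfate": (1522, 469, False),
    "Hardness": (1991, 0, False),
}


def alerts_for(name, distinct, missing, is_categorical):
    """Return alert messages for one variable (hypothetical helper)."""
    found = []
    if is_categorical and distinct > HIGH_CARDINALITY_THRESHOLD:
        found.append(f"{name} has a high cardinality: {distinct} distinct values")
    if missing > 0:
        pct = 100 * missing / N_ROWS
        found.append(f"{name} has {missing} ({pct:.1f}%) missing values")
    # Assumption: "unique" means every row holds a distinct, non-missing value.
    if missing == 0 and distinct == N_ROWS:
        found.append(f"{name} has unique values")
    return found


for name, (distinct, missing, is_categorical) in columns.items():
    for message in alerts_for(name, distinct, missing, is_categorical):
        print(message)
```

Potability produces no alert under these rules because it has only 2 distinct values and no missing cells.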
Reproduction
| Analysis started | 2023-05-15 09:54:37.382604 |
|---|---|
| Analysis finished | 2023-05-15 09:55:01.843360 |
| Duration | 24.46 seconds |
| Software version | pandas-profiling v3.6.6 |
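The reported duration can be recomputed from the two timestamps. This is a small sketch using only the Python standard library; it checks the arithmetic and does not rerun the analysis.

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2023-05-15 09:54:37.382604")
finished = datetime.fromisoformat("2023-05-15 09:55:01.843360")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```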
STATION CODE
Real number (ℝ)
| Distinct | 320 |
|---|---|
| Distinct (%) | 17.1% |
| Missing | 122 |
| Missing (%) | 6.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1953.6447 |
| Minimum | 2 |
| Maximum | 3473 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.7 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 1026 |
| Q1 | 1448 |
| median | 1861 |
| Q3 | 2423 |
| 95-th percentile | 3362 |
| Maximum | 3473 |
| Range | 3471 |
| Interquartile range (IQR) | 975 |
Descriptive statistics
| Standard deviation | 744.95769 |
|---|---|
| Coefficient of variation (CV) | 0.38131687 |
| Kurtosis | 0.045686455 |
| Mean | 1953.6447 |
| Median Absolute Deviation (MAD) | 446 |
| Skewness | 0.014082885 |
| Sum | 3651362 |
| Variance | 554961.96 |
| Monotonicity | Not monotonic |
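Several of the statistics above are derived from others, so they can be sanity-checked with simple arithmetic. The sketch below uses the STATION CODE figures from this report; the same relations should hold for the other numeric variables. It illustrates the definitions and is not the profiler's code.

```python
import math

# Values copied from the STATION CODE tables in this report.
n_observations = 1991
n_missing = 122
minimum, maximum = 2, 3473
q1, q3 = 1448, 2423
mean = 1953.6447
std = 744.95769
total = 3651362  # the "Sum" row

value_range = maximum - minimum      # Range
iqr = q3 - q1                        # Interquartile range
cv = std / mean                      # Coefficient of variation
variance = std ** 2                  # Variance is the squared standard deviation
n_present = round(total / mean)      # Sum / Mean recovers the non-missing count

print(f"Range: {value_range}, IQR: {iqr}")
print(f"CV: {cv:.6f}, Variance: {variance:.2f}")
print(f"Non-missing observations: {n_present}")
```

The recovered non-missing count equals the number of observations minus the missing count, which confirms that the mean and sum are computed over non-missing values only.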
| Value | Count | Frequency (%) |
| 1573 | 10 | 0.5% |
| 1571 | 10 | 0.5% |
| 1643 | 10 | 0.5% |
| 42 | 10 | 0.5% |
| 1450 | 10 | 0.5% |
| 1399 | 10 | 0.5% |
| 1566 | 10 | 0.5% |
| 1564 | 10 | 0.5% |
| 1159 | 10 | 0.5% |
| 1151 | 10 | 0.5% |
| Other values (310) | 1769 | |
| (Missing) | 122 | 6.1% |
| Value | Count | Frequency (%) |
| 2 | 1 | 0.1% |
| 17 | 10 | |
| 18 | 10 | |
| 20 | 10 | |
| 21 | 10 | |
| 42 | 10 | |
| 43 | 10 | |
| 1023 | 9 | |
| 1024 | 9 | |
| 1025 | 9 |
| Value | Count | Frequency (%) |
| 3473 | 3 | |
| 3471 | 3 | |
| 3468 | 3 | |
| 3466 | 3 | |
| 3465 | 3 | |
| 3464 | 3 | |
| 3460 | 3 | |
| 3459 | 3 | |
| 3458 | 3 | |
| 3384 | 3 |
LOCATIONS
Categorical
| Distinct | 666 |
|---|---|
| Distinct (%) | 33.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.7 KiB |
Length
| Max length | 110 |
|---|---|
| Median length | 77 |
| Mean length | 33.595178 |
| Min length | 3 |
Characters and Unicode
| Total characters | 66888 |
|---|---|
| Distinct characters | 52 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 305 |
|---|---|
| Unique (%) | 15.3% |
Sample
| 1st row | DAMANGANGA AT D/S OF MADHUBAN, DAMAN |
|---|---|
| 2nd row | ZUARI AT D/S OF PT. WHERE KUMBARJRIA CANAL JOINS, GOA |
| 3rd row | ZUARI AT PANCHAWADI |
| 4th row | RIVER ZUARI AT BORIM BRIDGE |
| 5th row | RIVER ZUARI AT MARCAIM JETTY |
Common Values
| Value | Count | Frequency (%) |
| NAN | 184 | 9.2% |
| GAUTAMINANGODAVARI RIVER | 10 | 0.5% |
| ZUARI AT PANCHAWADI | 10 | 0.5% |
| PERIYAR AT SEWAGE DISCHARGE POINT, KERALA | 8 | 0.4% |
| TUIRIAL LOWER CATCHMENT | 8 | 0.4% |
| KALU AT ATALE VILLAGE, MAHARASHTRA | 8 | 0.4% |
| TUIRIAL UPPER CATCHMENT | 8 | 0.4% |
| KUNDALIKA AT ROHA CITY, MAHARASHTRA | 8 | 0.4% |
| TLAWNG DOWNSTREAM AIZAWL | 8 | 0.4% |
| TLAWNG UPSTREAM AIZAWL | 8 | 0.4% |
| Other values (656) | 1731 |
Words
| Value | Count | Frequency (%) |
| at | 1353 | 13.6% |
| river | 535 | 5.4% |
| of | 243 | 2.4% |
| d/s | 222 | 2.2% |
| nan | 221 | 2.2% |
| kerala | 211 | 2.1% |
| r | 203 | 2.0% |
| bridge | 135 | 1.4% |
| near | 128 | 1.3% |
| u/s | 123 | 1.2% |
| Other values (765) | 6611 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 11551 | |
| (space) | 8079 | 12.1% |
| R | 5121 | 7.7% |
| I | 3859 | 5.8% |
| N | 3640 | 5.4% |
| T | 3607 | 5.4% |
| E | 2818 | 4.2% |
| L | 2802 | 4.2% |
| H | 2368 | 3.5% |
| U | 2152 | 3.2% |
| Other values (42) | 20891 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 56118 | |
| Space Separator | 8079 | 12.1% |
| Other Punctuation | 2309 | 3.5% |
| Open Punctuation | 135 | 0.2% |
| Close Punctuation | 129 | 0.2% |
| Decimal Number | 96 | 0.1% |
| Lowercase Letter | 22 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 11551 | |
| R | 5121 | 9.1% |
| I | 3859 | 6.9% |
| N | 3640 | 6.5% |
| T | 3607 | 6.4% |
| E | 2818 | 5.0% |
| L | 2802 | 5.0% |
| H | 2368 | 4.2% |
| U | 2152 | 3.8% |
| M | 2102 | 3.7% |
| Other values (16) | 16098 |
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 5 | |
| e | 3 | |
| r | 3 | |
| s | 3 | |
| t | 2 | 9.1% |
| o | 1 | 4.5% |
| f | 1 | 4.5% |
| u | 1 | 4.5% |
| v | 1 | 4.5% |
| g | 1 | 4.5% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 24 | |
| 2 | 19 | |
| 0 | 18 | |
| 3 | 13 | |
| 8 | 9 | 9.4% |
| 6 | 6 | 6.2% |
| 9 | 6 | 6.2% |
| 5 | 1 | 1.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 1535 | |
| / | 395 | 17.1% |
| . | 364 | 15.8% |
| \ | 15 | 0.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 8079 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 135 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 129 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 56140 | |
| Common | 10748 | 16.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 11551 | |
| R | 5121 | 9.1% |
| I | 3859 | 6.9% |
| N | 3640 | 6.5% |
| T | 3607 | 6.4% |
| E | 2818 | 5.0% |
| L | 2802 | 5.0% |
| H | 2368 | 4.2% |
| U | 2152 | 3.8% |
| M | 2102 | 3.7% |
| Other values (27) | 16120 |
Common
| Value | Count | Frequency (%) |
| (space) | 8079 | |
| , | 1535 | 14.3% |
| / | 395 | 3.7% |
| . | 364 | 3.4% |
| ( | 135 | 1.3% |
| ) | 129 | 1.2% |
| 1 | 24 | 0.2% |
| 2 | 19 | 0.2% |
| 0 | 18 | 0.2% |
| \ | 15 | 0.1% |
| Other values (5) | 35 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 66888 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 11551 | |
| (space) | 8079 | 12.1% |
| R | 5121 | 7.7% |
| I | 3859 | 5.8% |
| N | 3640 | 5.4% |
| T | 3607 | 5.4% |
| E | 2818 | 4.2% |
| L | 2802 | 4.2% |
| H | 2368 | 3.5% |
| U | 2152 | 3.2% |
| Other values (42) | 20891 |
STATE
Categorical
| Distinct | 203 |
|---|---|
| Distinct (%) | 10.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.7 KiB |
Length
| Max length | 93 |
|---|---|
| Median length | 79 |
| Mean length | 8.4590658 |
| Min length | 3 |
Characters and Unicode
| Total characters | 16842 |
|---|---|
| Distinct characters | 49 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 177 |
|---|---|
| Unique (%) | 8.9% |
Sample
| 1st row | DAMAN & DIU |
|---|---|
| 2nd row | GOA |
| 3rd row | GOA |
| 4th row | GOA |
| 5th row | GOA |
Common Values
| Value | Count | Frequency (%) |
| NAN | 761 | |
| KERALA | 275 | 13.8% |
| MAHARASHTRA | 142 | 7.1% |
| MEGHALAYA | 125 | 6.3% |
| GOA | 101 | 5.1% |
| MANIPUR | 76 | 3.8% |
| PUNJAB | 48 | 2.4% |
| TAMILNADU | 42 | 2.1% |
| GUJARAT | 37 | 1.9% |
| ORISSA | 30 | 1.5% |
| Other values (193) | 354 |
Words
| Value | Count | Frequency (%) |
| nan | 765 | |
| kerala | 303 | 10.7% |
| maharashtra | 149 | 5.3% |
| meghalaya | 129 | 4.6% |
| at | 124 | 4.4% |
| goa | 110 | 3.9% |
| manipur | 84 | 3.0% |
| punjab | 58 | 2.1% |
| tamilnadu | 46 | 1.6% |
| gujarat | 39 | 1.4% |
| Other values (429) | 1020 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 4212 | |
| N | 2127 | |
| R | 1418 | 8.4% |
| (space) | 842 | 5.0% |
| H | 775 | 4.6% |
| E | 729 | 4.3% |
| L | 728 | 4.3% |
| M | 682 | 4.0% |
| T | 600 | 3.6% |
| I | 591 | 3.5% |
| Other values (39) | 4138 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 15676 | |
| Space Separator | 842 | 5.0% |
| Other Punctuation | 248 | 1.5% |
| Lowercase Letter | 42 | 0.2% |
| Close Punctuation | 14 | 0.1% |
| Open Punctuation | 14 | 0.1% |
| Decimal Number | 6 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 4212 | |
| N | 2127 | |
| R | 1418 | 9.0% |
| H | 775 | 4.9% |
| E | 729 | 4.7% |
| L | 728 | 4.6% |
| M | 682 | 4.4% |
| T | 600 | 3.8% |
| I | 591 | 3.8% |
| G | 462 | 2.9% |
| Other values (15) | 3352 |
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 10 | |
| r | 8 | |
| p | 4 | 9.5% |
| h | 3 | 7.1% |
| u | 3 | 7.1% |
| t | 3 | 7.1% |
| i | 3 | 7.1% |
| s | 2 | 4.8% |
| d | 2 | 4.8% |
| n | 2 | 4.8% |
| Other values (2) | 2 | 4.8% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 155 | |
| . | 40 | 16.1% |
| / | 39 | 15.7% |
| & | 12 | 4.8% |
| \ | 2 | 0.8% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 2 | |
| 0 | 2 | |
| 2 | 1 | |
| 8 | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 842 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 14 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 14 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 15718 | |
| Common | 1124 | 6.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 4212 | |
| N | 2127 | |
| R | 1418 | 9.0% |
| H | 775 | 4.9% |
| E | 729 | 4.6% |
| L | 728 | 4.6% |
| M | 682 | 4.3% |
| T | 600 | 3.8% |
| I | 591 | 3.8% |
| G | 462 | 2.9% |
| Other values (27) | 3394 |
Common
| Value | Count | Frequency (%) |
| (space) | 842 | |
| , | 155 | 13.8% |
| . | 40 | 3.6% |
| / | 39 | 3.5% |
| ) | 14 | 1.2% |
| ( | 14 | 1.2% |
| & | 12 | 1.1% |
| 1 | 2 | 0.2% |
| 0 | 2 | 0.2% |
| \ | 2 | 0.2% |
| Other values (2) | 2 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 16842 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 4212 | |
| N | 2127 | |
| R | 1418 | 8.4% |
| (space) | 842 | 5.0% |
| H | 775 | 4.6% |
| E | 729 | 4.3% |
| L | 728 | 4.3% |
| M | 682 | 4.0% |
| T | 600 | 3.6% |
| I | 591 | 3.5% |
| Other values (39) | 4138 |
year
Real number (ℝ)
| Distinct | 12 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2010.0382 |
| Minimum | 2003 |
| Maximum | 2014 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.7 KiB |
Quantile statistics
| Minimum | 2003 |
|---|---|
| 5-th percentile | 2005 |
| Q1 | 2008 |
| median | 2011 |
| Q3 | 2013 |
| 95-th percentile | 2014 |
| Maximum | 2014 |
| Range | 11 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.0573331 |
|---|---|
| Coefficient of variation (CV) | 0.0015210324 |
| Kurtosis | -0.57394271 |
| Mean | 2010.0382 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.59388594 |
| Sum | 4001986 |
| Variance | 9.3472859 |
| Monotonicity | Decreasing |
| Value | Count | Frequency (%) |
| 2012 | 292 | |
| 2013 | 261 | |
| 2014 | 245 | |
| 2011 | 231 | |
| 2010 | 188 | |
| 2009 | 181 | |
| 2008 | 159 | |
| 2007 | 120 | |
| 2005 | 119 | |
| 2006 | 105 | 5.3% |
| Other values (2) | 90 | 4.5% |
| Value | Count | Frequency (%) |
| 2003 | 88 | 4.4% |
| 2004 | 2 | 0.1% |
| 2005 | 119 | |
| 2006 | 105 | 5.3% |
| 2007 | 120 | |
| 2008 | 159 | |
| 2009 | 181 | |
| 2010 | 188 | |
| 2011 | 231 | |
| 2012 | 292 |
| Value | Count | Frequency (%) |
| 2014 | 245 | |
| 2013 | 261 | |
| 2012 | 292 | |
| 2011 | 231 | |
| 2010 | 188 | |
| 2009 | 181 | |
| 2008 | 159 | |
| 2007 | 120 | |
| 2006 | 105 | 5.3% |
| 2005 | 119 |
ph
Real number (ℝ)
| Distinct | 1690 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 301 |
| Missing (%) | 15.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.0850939 |
| Minimum | 0.22749905 |
| Maximum | 13.175402 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.7 KiB |
Quantile statistics
| Minimum | 0.22749905 |
|---|---|
| 5-th percentile | 4.6936904 |
| Q1 | 6.145172 |
| median | 7.0323289 |
| Q3 | 8.0072492 |
| 95-th percentile | 9.6427091 |
| Maximum | 13.175402 |
| Range | 12.947903 |
| Interquartile range (IQR) | 1.8620772 |
Descriptive statistics
| Standard deviation | 1.5091002 |
|---|---|
| Coefficient of variation (CV) | 0.2129965 |
| Kurtosis | 0.61768791 |
| Mean | 7.0850939 |
| Median Absolute Deviation (MAD) | 0.92829066 |
| Skewness | 0.0055431689 |
| Sum | 11973.809 |
| Variance | 2.2773834 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4.909640554 | 1 | 0.1% |
| 6.652824348 | 1 | 0.1% |
| 8.070477127 | 1 | 0.1% |
| 6.584813111 | 1 | 0.1% |
| 5.742533063 | 1 | 0.1% |
| 5.343075103 | 1 | 0.1% |
| 6.057068041 | 1 | 0.1% |
| 8.132736965 | 1 | 0.1% |
| 7.021295306 | 1 | 0.1% |
| 5.918953546 | 1 | 0.1% |
| Other values (1680) | 1680 | |
| (Missing) | 301 | 15.1% |
| Value | Count | Frequency (%) |
| 0.2274990502 | 1 | |
| 0.9899122129 | 1 | |
| 1.757037115 | 1 | |
| 1.844538366 | 1 | |
| 2.569243562 | 1 | |
| 2.612035915 | 1 | |
| 2.69083124 | 1 | |
| 2.798549099 | 1 | |
| 3.344588533 | 1 | |
| 3.388090611 | 1 |
| Value | Count | Frequency (%) |
| 13.17540172 | 1 | |
| 12.24692807 | 1 | |
| 11.89807803 | 1 | |
| 11.53488049 | 1 | |
| 11.301794 | 1 | |
| 11.26782838 | 1 | |
| 11.24450714 | 1 | |
| 11.18069466 | 1 | |
| 11.18028447 | 1 | |
| 11.02787986 | 1 |
Hardness
Real number (ℝ)
| Distinct | 1991 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 196.64565 |
| Minimum | 47.432 |
| Maximum | 323.124 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.7 KiB |
Quantile statistics
| Minimum | 47.432 |
|---|---|
| 5-th percentile | 144.95655 |
| Q1 | 177.22372 |
| median | 197.46047 |
| Q3 | 215.78655 |
| 95-th percentile | 248.84627 |
| Maximum | 323.124 |
| Range | 275.692 |
| Interquartile range (IQR) | 38.562829 |
Descriptive statistics
| Standard deviation | 32.211865 |
|---|---|
| Coefficient of variation (CV) | 0.16380665 |
| Kurtosis | 0.99974932 |
| Mean | 196.64565 |
| Median Absolute Deviation (MAD) | 19.22681 |
| Skewness | -0.02916238 |
| Sum | 391521.49 |
| Variance | 1037.6042 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 204.8904555 | 1 | 0.1% |
| 196.7827837 | 1 | 0.1% |
| 198.8659477 | 1 | 0.1% |
| 182.3754558 | 1 | 0.1% |
| 182.941032 | 1 | 0.1% |
| 152.5076794 | 1 | 0.1% |
| 211.6620911 | 1 | 0.1% |
| 184.3732318 | 1 | 0.1% |
| 206.786448 | 1 | 0.1% |
| 189.814682 | 1 | 0.1% |
| Other values (1981) | 1981 |
| Value | Count | Frequency (%) |
| 47.432 | 1 | |
| 73.49223369 | 1 | |
| 77.4595861 | 1 | |
| 81.71089527 | 1 | |
| 94.09130748 | 1 | |
| 97.2809086 | 1 | |
| 98.45293051 | 1 | |
| 98.77164353 | 1 | |
| 100.4576151 | 1 | |
| 103.173587 | 1 |
| Value | Count | Frequency (%) |
| 323.124 | 1 | |
| 311.3839565 | 1 | |
| 308.2538329 | 1 | |
| 307.7060241 | 1 | |
| 306.6274814 | 1 | |
| 304.2359121 | 1 | |
| 300.2924758 | 1 | |
| 291.4618974 | 1 | |
| 287.3702082 | 1 | |
| 286.2017633 | 1 |
Solids
Real number (ℝ)
| Distinct | 1991 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22033.523 |
| Minimum | 1372.091 |
| Maximum | 56867.859 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.7 KiB |
Quantile statistics
| Minimum | 1372.091 |
|---|---|
| 5-th percentile | 9194.5327 |
| Q1 | 15472.588 |
| median | 20920.252 |
| Q3 | 27419.474 |
| 95-th percentile | 38873.693 |
| Maximum | 56867.859 |
| Range | 55495.768 |
| Interquartile range (IQR) | 11946.886 |
Descriptive statistics
| Standard deviation | 8951.8752 |
|---|---|
| Coefficient of variation (CV) | 0.40628434 |
| Kurtosis | 0.26585929 |
| Mean | 22033.523 |
| Median Absolute Deviation (MAD) | 5948.3025 |
| Skewness | 0.59887461 |
| Sum | 43868743 |
| Variance | 80136069 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 20791.31898 | 1 | 0.1% |
| 19024.68867 | 1 | 0.1% |
| 18266.61772 | 1 | 0.1% |
| 24723.1063 | 1 | 0.1% |
| 21293.88975 | 1 | 0.1% |
| 11398.7311 | 1 | 0.1% |
| 45166.91214 | 1 | 0.1% |
| 14807.26849 | 1 | 0.1% |
| 25838.12848 | 1 | 0.1% |
| 19887.76983 | 1 | 0.1% |
| Other values (1981) | 1981 |
| Value | Count | Frequency (%) |
| 1372.091043 | 1 | |
| 2552.962804 | 1 | |
| 2808.025756 | 1 | |
| 2912.211247 | 1 | |
| 3773.281147 | 1 | |
| 3802.411681 | 1 | |
| 3900.913892 | 1 | |
| 4111.785432 | 1 | |
| 4142.499001 | 1 | |
| 4168.196994 | 1 |
| Value | Count | Frequency (%) |
| 56867.85924 | 1 | |
| 56488.67241 | 1 | |
| 56351.3963 | 1 | |
| 55334.7028 | 1 | |
| 52318.9173 | 1 | |
| 52060.2268 | 1 | |
| 50279.26243 | 1 | |
| 49125.36008 | 1 | |
| 49074.73041 | 1 | |
| 49009.92466 | 1 |
Chloramines
Real number (ℝ)
| Distinct | 1991 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.0891025 |
| Minimum | 0.53035129 |
| Maximum | 13.127 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.7 KiB |
Quantile statistics
| Minimum | 0.53035129 |
|---|---|
| 5-th percentile | 4.4980086 |
| Q1 | 6.0330555 |
| median | 7.0891462 |
| Q3 | 8.122656 |
| 95-th percentile | 9.7641521 |
| Maximum | 13.127 |
| Range | 12.596649 |
| Interquartile range (IQR) | 2.0896005 |
Descriptive statistics
| Standard deviation | 1.5978603 |
|---|---|
| Coefficient of variation (CV) | 0.2253967 |
| Kurtosis | 0.28544938 |
| Mean | 7.0891025 |
| Median Absolute Deviation (MAD) | 1.048411 |
| Skewness | 0.057570783 |
| Sum | 14114.403 |
| Variance | 2.5531576 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 7.300211873 | 1 | 0.1% |
| 6.911867556 | 1 | 0.1% |
| 6.90236983 | 1 | 0.1% |
| 6.238919727 | 1 | 0.1% |
| 6.826412756 | 1 | 0.1% |
| 8.728973148 | 1 | 0.1% |
| 6.651801292 | 1 | 0.1% |
| 5.753405052 | 1 | 0.1% |
| 8.684832598 | 1 | 0.1% |
| 8.115767881 | 1 | 0.1% |
| Other values (1981) | 1981 |
| Value | Count | Frequency (%) |
| 0.5303512947 | 1 | |
| 1.683992581 | 1 | |
| 2.456013596 | 1 | |
| 2.484379977 | 1 | |
| 2.577555273 | 1 | |
| 2.621267556 | 1 | |
| 2.741712117 | 1 | |
| 2.750837309 | 1 | |
| 2.862535374 | 1 | |
| 2.86607303 | 1 |
| Value | Count | Frequency (%) |
| 13.127 | 1 | |
| 13.04380611 | 1 | |
| 12.91218664 | 1 | |
| 12.58002649 | 1 | |
| 12.36328483 | 1 | |
| 12.27937418 | 1 | |
| 12.0625362 | 1 | |
| 11.58615108 | 1 | |
| 11.54319047 | 1 | |
| 11.52359751 | 1 |
Sulfate
Real number (ℝ)
| Distinct | 1522 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 469 |
| Missing (%) | 23.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 332.24427 |
| Minimum | 129 |
| Maximum | 476.53972 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.7 KiB |
Quantile statistics
| Minimum | 129 |
|---|---|
| 5-th percentile | 261.21604 |
| Q1 | 306.26073 |
| median | 331.53619 |
| Q3 | 358.24268 |
| 95-th percentile | 400.69067 |
| Maximum | 476.53972 |
| Range | 347.53972 |
| Interquartile range (IQR) | 51.981954 |
Descriptive statistics
| Standard deviation | 42.539915 |
|---|---|
| Coefficient of variation (CV) | 0.12803807 |
| Kurtosis | 0.84516993 |
| Mean | 332.24427 |
| Median Absolute Deviation (MAD) | 25.820545 |
| Skewness | -0.092040178 |
| Sum | 505675.77 |
| Variance | 1809.6444 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 376.811708 | 1 | 0.1% |
| 396.6195096 | 1 | 0.1% |
| 306.5430715 | 1 | 0.1% |
| 400.6696015 | 1 | 0.1% |
| 279.7674997 | 1 | 0.1% |
| 384.8219665 | 1 | 0.1% |
| 346.6118493 | 1 | 0.1% |
| 415.9278983 | 1 | 0.1% |
| 320.535989 | 1 | 0.1% |
| 321.9745697 | 1 | 0.1% |
| Other values (1512) | 1512 | |
| (Missing) | 469 | 23.6% |
| Value | Count | Frequency (%) |
| 129 | 1 | |
| 180.2067464 | 1 | |
| 182.3973702 | 1 | |
| 187.1707144 | 1 | |
| 187.4241309 | 1 | |
| 192.0335917 | 1 | |
| 203.4445208 | 1 | |
| 206.2472294 | 1 | |
| 207.8904823 | 1 | |
| 209.4710584 | 1 |
| Value | Count | Frequency (%) |
| 476.5397173 | 1 | |
| 475.7374602 | 1 | |
| 462.474215 | 1 | |
| 460.107069 | 1 | |
| 458.4410723 | 1 | |
| 455.4512337 | 1 | |
| 449.2676875 | 1 | |
| 445.9383912 | 1 | |
| 445.3595467 | 1 | |
| 444.970552 | 1 |
Organic_carbon
Real number (ℝ)
| Distinct | 1991 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14.242741 |
| Minimum | 2.2 |
| Maximum | 28.3 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.7 KiB |
Quantile statistics
| Minimum | 2.2 |
|---|---|
| 5-th percentile | 8.8142365 |
| Q1 | 12.001863 |
| median | 14.189062 |
| Q3 | 16.543395 |
| 95-th percentile | 19.585249 |
| Maximum | 28.3 |
| Range | 26.1 |
| Interquartile range (IQR) | 4.5415326 |
Descriptive statistics
| Standard deviation | 3.3167334 |
|---|---|
| Coefficient of variation (CV) | 0.23287185 |
| Kurtosis | 0.056122114 |
| Mean | 14.242741 |
| Median Absolute Deviation (MAD) | 2.2717747 |
| Skewness | -0.0030376768 |
| Sum | 28357.298 |
| Variance | 11.000721 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10.37978308 | 1 | 0.1% |
| 10.77286174 | 1 | 0.1% |
| 10.92446056 | 1 | 0.1% |
| 17.58261495 | 1 | 0.1% |
| 11.14407224 | 1 | 0.1% |
| 11.67660134 | 1 | 0.1% |
| 19.68233697 | 1 | 0.1% |
| 14.7530554 | 1 | 0.1% |
| 11.63301523 | 1 | 0.1% |
| 12.95891702 | 1 | 0.1% |
| Other values (1981) | 1981 |
| Value | Count | Frequency (%) |
| 2.2 | 1 | |
| 4.371898608 | 1 | |
| 4.473092264 | 1 | |
| 4.861631498 | 1 | |
| 4.966861619 | 1 | |
| 5.051694615 | 1 | |
| 5.218232927 | 1 | |
| 5.315286537 | 1 | |
| 5.362370906 | 1 | |
| 5.512039718 | 1 |
| Value | Count | Frequency (%) |
| 28.3 | 1 | |
| 23.95245044 | 1 | |
| 23.91760126 | 1 | |
| 23.56964491 | 1 | |
| 23.51477377 | 1 | |
| 23.39951606 | 1 | |
| 23.37326504 | 1 | |
| 23.31769912 | 1 | |
| 23.23432591 | 1 | |
| 23.13595214 | 1 |
Trihalomethanes
Real number (ℝ)
| Distinct | 1891 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 100 |
| Missing (%) | 5.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 66.816794 |
| Minimum | 8.1758764 |
| Maximum | 120.03008 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.7 KiB |
Quantile statistics
| Minimum | 8.1758764 |
|---|---|
| 5-th percentile | 39.552362 |
| Q1 | 56.584099 |
| median | 66.701621 |
| Q3 | 77.766434 |
| 95-th percentile | 92.404823 |
| Maximum | 120.03008 |
| Range | 111.8542 |
| Interquartile range (IQR) | 21.182335 |
Descriptive statistics
| Standard deviation | 16.307375 |
|---|---|
| Coefficient of variation (CV) | 0.24406103 |
| Kurtosis | 0.23289937 |
| Mean | 66.816794 |
| Median Absolute Deviation (MAD) | 10.686517 |
| Skewness | -0.12898369 |
| Sum | 126350.56 |
| Variance | 265.93049 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 86.99097046 | 1 | 0.1% |
| 66.35965777 | 1 | 0.1% |
| 68.61239104 | 1 | 0.1% |
| 58.60094012 | 1 | 0.1% |
| 70.54686225 | 1 | 0.1% |
| 34.26586026 | 1 | 0.1% |
| 70.93748133 | 1 | 0.1% |
| 77.3399182 | 1 | 0.1% |
| 65.88233542 | 1 | 0.1% |
| 80.65197814 | 1 | 0.1% |
| Other values (1881) | 1881 | |
| (Missing) | 100 | 5.0% |
| Value | Count | Frequency (%) |
| 8.175876384 | 1 | |
| 8.577012933 | 1 | |
| 16.2915046 | 1 | |
| 17.00068293 | 1 | |
| 17.52776496 | 1 | |
| 17.91572257 | 1 | |
| 18.10122217 | 1 | |
| 18.40001219 | 1 | |
| 19.17517454 | 1 | |
| 20.33775264 | 1 |
| Value | Count | Frequency (%) |
| 120.030077 | 1 | |
| 118.3572747 | 1 | |
| 116.1616216 | 1 | |
| 114.2086714 | 1 | |
| 113.0488857 | 1 | |
| 112.622733 | 1 | |
| 110.7392993 | 1 | |
| 110.4310803 | 1 | |
| 108.849568 | 1 | |
| 108.5894144 | 1 |
Turbidity
Real number (ℝ)
| Distinct | 1991 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.9495801 |
| Minimum | 1.4922066 |
| Maximum | 6.739 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.7 KiB |
Quantile statistics
| Minimum | 1.4922066 |
|---|---|
| 5-th percentile | 2.6441244 |
| Q1 | 3.4077591 |
| median | 3.9402823 |
| Q3 | 4.4899879 |
| 95-th percentile | 5.2287093 |
| Maximum | 6.739 |
| Range | 5.2467934 |
| Interquartile range (IQR) | 1.0822288 |
Descriptive statistics
| Standard deviation | 0.78699403 |
|---|---|
| Coefficient of variation (CV) | 0.19926018 |
| Kurtosis | -0.10490908 |
| Mean | 3.9495801 |
| Median Absolute Deviation (MAD) | 0.54168975 |
| Skewness | 0.010736876 |
| Sum | 7863.614 |
| Variance | 0.61935961 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2.963135381 | 1 | 0.1% |
| 2.728799592 | 1 | 0.1% |
| 3.055790475 | 1 | 0.1% |
| 4.404132196 | 1 | 0.1% |
| 4.272202758 | 1 | 0.1% |
| 5.444927204 | 1 | 0.1% |
| 4.240031708 | 1 | 0.1% |
| 4.371747854 | 1 | 0.1% |
| 2.969433921 | 1 | 0.1% |
| 4.893750992 | 1 | 0.1% |
| Other values (1981) | 1981 |
| Value | Count | Frequency (%) |
| 1.492206615 | 1 | |
| 1.496100943 | 1 | |
| 1.659799385 | 1 | |
| 1.680554025 | 1 | |
| 1.687624505 | 1 | |
| 1.81252894 | 1 | |
| 1.943318777 | 1 | |
| 1.964863097 | 1 | |
| 1.986191593 | 1 | |
| 2.000757032 | 1 |
| Value | Count | Frequency (%) |
| 6.739 | 1 | |
| 6.494249467 | 1 | |
| 6.389161009 | 1 | |
| 6.35743852 | 1 | |
| 6.204846359 | 1 | |
| 6.073006014 | 1 | |
| 6.06455925 | 1 | |
| 6.038184953 | 1 | |
| 6.032994877 | 1 | |
| 5.989542791 | 1 |
Potability
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.7 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1991 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1250 | 62.8% |
| 1 | 741 | 37.2% |
Words
| Value | Count | Frequency (%) |
| 0 | 1250 | |
| 1 | 741 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1250 | |
| 1 | 741 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1991 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1250 | |
| 1 | 741 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1991 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1250 | |
| 1 | 741 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1991 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1250 | |
| 1 | 741 |
Correlations
| | STATION CODE | year | ph | Hardness | Solids | Chloramines | Sulfate | Organic_carbon | Trihalomethanes | Turbidity | Potability |
|---|---|---|---|---|---|---|---|---|---|---|---|
| STATION CODE | 1.000 | 0.178 | 0.017 | -0.040 | 0.008 | -0.007 | 0.023 | -0.026 | -0.012 | 0.010 | 0.000 |
| year | 0.178 | 1.000 | 0.004 | 0.055 | -0.008 | 0.021 | -0.033 | -0.003 | -0.031 | 0.021 | 0.552 |
| ph | 0.017 | 0.004 | 1.000 | 0.026 | -0.153 | 0.002 | 0.165 | 0.048 | 0.031 | -0.064 | 0.078 |
| Hardness | -0.040 | 0.055 | 0.026 | 1.000 | -0.059 | 0.186 | -0.027 | -0.015 | 0.002 | -0.024 | 0.074 |
| Solids | 0.008 | -0.008 | -0.153 | -0.059 | 1.000 | -0.086 | -0.126 | 0.025 | -0.012 | 0.057 | 0.037 |
| Chloramines | -0.007 | 0.021 | 0.002 | 0.186 | -0.086 | 1.000 | 0.062 | -0.017 | 0.008 | -0.021 | 0.080 |
| Sulfate | 0.023 | -0.033 | 0.165 | -0.027 | -0.126 | 0.062 | 1.000 | 0.025 | -0.045 | -0.031 | 0.203 |
| Organic_carbon | -0.026 | -0.003 | 0.048 | -0.015 | 0.025 | -0.017 | 0.025 | 1.000 | -0.034 | -0.000 | 0.029 |
| Trihalomethanes | -0.012 | -0.031 | 0.031 | 0.002 | -0.012 | 0.008 | -0.045 | -0.034 | 1.000 | -0.045 | 0.000 |
| Turbidity | 0.010 | 0.021 | -0.064 | -0.024 | 0.057 | -0.021 | -0.031 | -0.000 | -0.045 | 1.000 | 0.000 |
| Potability | 0.000 | 0.552 | 0.078 | 0.074 | 0.037 | 0.080 | 0.203 | 0.029 | 0.000 | 0.000 | 1.000 |
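The "High correlation" alerts for year and Potability correspond to the single off-diagonal entry of 0.552 in this matrix. The sketch below scans the Potability row for coefficients at or above a cut-off. The threshold of 0.5 is an assumption for illustration; the configured value is not stated in this report.

```python
# Assumed cut-off for illustration; not taken from the report's configuration.
HIGH_CORRELATION_THRESHOLD = 0.5

# Potability row of the correlation matrix above (diagonal entry omitted).
potability_row = {
    "STATION CODE": 0.000,
    "year": 0.552,
    "ph": 0.078,
    "Hardness": 0.074,
    "Solids": 0.037,
    "Chloramines": 0.080,
    "Sulfate": 0.203,
    "Organic_carbon": 0.029,
    "Trihalomethanes": 0.000,
    "Turbidity": 0.000,
}

flagged = [
    name
    for name, coefficient in potability_row.items()
    if abs(coefficient) >= HIGH_CORRELATION_THRESHOLD
]
print(flagged)
```

Under this cut-off only year is flagged; the next largest coefficient in the row is Sulfate at 0.203.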
First rows
| | STATION CODE | LOCATIONS | STATE | year | ph | Hardness | Solids | Chloramines | Sulfate | Organic_carbon | Trihalomethanes | Turbidity | Potability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1393 | DAMANGANGA AT D/S OF MADHUBAN, DAMAN | DAMAN & DIU | 2014 | NaN | 204.890455 | 20791.318981 | 7.300212 | 368.516441 | 10.379783 | 86.990970 | 2.963135 | 0 |
| 1 | 1399 | ZUARI AT D/S OF PT. WHERE KUMBARJRIA CANAL JOINS, GOA | GOA | 2014 | 3.716080 | 129.422921 | 18630.057858 | 6.635246 | NaN | 15.180013 | 56.329076 | 4.500656 | 0 |
| 2 | 1475 | ZUARI AT PANCHAWADI | GOA | 2014 | 8.099124 | 224.236259 | 19909.541732 | 9.275884 | NaN | 16.868637 | 66.420093 | 3.055934 | 0 |
| 3 | 3181 | RIVER ZUARI AT BORIM BRIDGE | GOA | 2014 | 8.316766 | 214.373394 | 22018.417441 | 8.059332 | 356.886136 | 18.436524 | 100.341674 | 4.628771 | 0 |
| 4 | 3182 | RIVER ZUARI AT MARCAIM JETTY | GOA | 2014 | 9.092223 | 181.101509 | 17978.986339 | 6.546600 | 310.135738 | 11.558279 | 31.997993 | 4.075075 | 0 |
| 5 | 1400 | MANDOVI AT NEGHBOURHOOD OF PANAJI, GOA | GOA | 2014 | 5.584087 | 188.313324 | 28748.687739 | 7.544869 | 326.678363 | 8.399735 | 54.917862 | 2.559708 | 0 |
| 6 | 1476 | MANDOVI AT TONCA, MARCELA, GOA | GOA | 2014 | 10.223862 | 248.071735 | 28749.716544 | 7.513408 | 393.663396 | 13.789695 | 84.603556 | 2.672989 | 0 |
| 7 | 3185 | RIVER MANDOVI AT AMONA BRIDGE | GOA | 2014 | 8.635849 | 203.361523 | 13672.091764 | 4.563009 | 303.309771 | 12.363817 | 62.798309 | 4.401425 | 0 |
| 8 | 3186 | RIVER MANDOVI AT IFFI JETTY | GOA | 2014 | NaN | 118.988579 | 14285.583854 | 7.804174 | 268.646941 | 12.706049 | 53.928846 | 3.595017 | 0 |
| 9 | 3187 | RIVER MANDOVI NEAR HOTEL MARRIOT | GOA | 2014 | 11.180284 | 227.231469 | 25484.508491 | 9.077200 | 404.041635 | 17.927806 | 71.976601 | 4.370562 | 0 |
Last rows
| | STATION CODE | LOCATIONS | STATE | year | ph | Hardness | Solids | Chloramines | Sulfate | Organic_carbon | Trihalomethanes | Turbidity | Potability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1981 | 1160 | TAMBIRAPARANI AT CHERANMADEVI,CAUSE WAY,TAMILNADU | NAN | 2003 | NaN | 209.751955 | 20214.216552 | 6.045078 | 323.788383 | 20.278990 | 72.735207 | 4.258489 | 1 |
| 1982 | 1161 | TAMBIRAPARANI AT TIRUNELVELI,COLLECTORATE, TAMILNADU. | NAN | 2003 | 7.046549 | 128.482517 | 30569.810551 | 4.449123 | 281.724714 | 15.142006 | 58.157304 | 2.869226 | 1 |
| 1983 | 1162 | TAMBIRAPARANI AT MURAPPANADU, TAMILNADU | NAN | 2003 | NaN | 199.845875 | 12635.367704 | 7.886383 | 332.615154 | 13.217536 | 54.549618 | 4.480574 | 1 |
| 1984 | 1328 | TAMBIRAPARANI AT PAPPANKULAM,TAMILNADU | NAN | 2003 | 7.732880 | 189.509811 | 47022.745845 | 8.226725 | 287.087053 | 14.980054 | 71.206209 | 3.510728 | 1 |
| 1985 | 1329 | TAMBIRAPARANI AT RAIL BDG. NR. AMBASAMUDAM, TAMILNADU | NAN | 2003 | 6.266800 | 187.829617 | 27577.213623 | 9.141597 | 322.917848 | 13.290252 | 59.454325 | 3.652845 | 1 |
| 1986 | 1330 | TAMBIRAPARANI AT ARUMUGANERI, TAMILNADU | NAN | 2003 | 6.630252 | 160.920384 | 22557.779576 | 5.305394 | 338.630828 | 15.793101 | 53.276033 | 5.181202 | 1 |
| 1987 | 1450 | PALAR AT VANIYAMBADI WATER SUPPLY HEAD WORK, TAMILNADU | NAN | 2003 | 6.775631 | 154.372543 | 15525.393963 | 6.084133 | 343.032161 | 17.118543 | 56.124024 | 3.017544 | 1 |
| 1988 | 1403 | GUMTI AT U/S SOUTH TRIPURA,TRIPURA | NAN | 2003 | NaN | 204.737292 | 25680.717388 | 7.980193 | 318.677273 | 20.376838 | 69.020530 | 4.323785 | 1 |
| 1989 | 1404 | GUMTI AT D/S SOUTH TRIPURA, TRIPURA | NAN | 2003 | 8.164992 | 278.340358 | 29045.261138 | 7.992914 | 334.551966 | 17.406626 | 64.210767 | 4.162496 | 1 |
| 1990 | 1726 | CHANDRAPUR, AGARTALA D/S OF HAORA RIVER, TRIPURA | NAN | 2003 | 7.773758 | 251.462844 | 21688.616943 | 6.194910 | 395.088245 | 14.324552 | 67.584311 | 4.040974 | 1 |
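In the rows above, STATE holds the literal string NAN rather than a true missing value. That is consistent with the report listing 0 missing cells for STATE while NAN is its most common value (761 rows). The sketch below shows one possible way to convert such placeholders to real missing values before profiling. The helper name and the placeholder set are hypothetical, and the two rows are abbreviated from the sample tables.

```python
# Hypothetical set of strings to treat as missing; adjust to the actual data.
PLACEHOLDERS = {"NAN", "NA", ""}


def normalize_missing(value):
    """Return None for placeholder strings, otherwise the stripped text."""
    if value is None:
        return None
    text = value.strip()
    return None if text.upper() in PLACEHOLDERS else text


# Two abbreviated rows taken from the sample tables in this report.
rows = [
    {"LOCATIONS": "GUMTI AT U/S SOUTH TRIPURA,TRIPURA", "STATE": "NAN"},
    {"LOCATIONS": "ZUARI AT PANCHAWADI", "STATE": "GOA"},
]

cleaned = [{key: normalize_missing(val) for key, val in row.items()} for row in rows]
missing_states = sum(1 for row in cleaned if row["STATE"] is None)
print(f"Rows with missing STATE after cleaning: {missing_states}")
```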